AITopics | weight consolidation

Neural Information Processing Systems http://nips.cc/

dataset, neural network, segmentation

Neural Information Processing Systems

Country:

North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
North America > Canada > Quebec > Montreal (0.04)

Genre: Research Report (0.68)

Industry:

Health & Medicine > Therapeutic Area > Neurology (0.94)
Health & Medicine > Diagnostic Medicine > Imaging (0.94)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.70)

Distributed Weight Consolidation: A Brain Segmentation Case Study

Neural Information Processing Systems | Nov-20-2025, 21:43:23 GMT

Collecting the large datasets needed to train deep neural networks can be very difficult, particularly for the many applications for which sharing and pooling data is complicated by practical, ethical, or legal concerns. However, it may be the case that derivative datasets or predictive models developed within individual sites can be shared and combined with fewer restrictions. Training on distributed data and combining the resulting networks is often viewed as continual learning, but these methods require networks to be trained sequentially. In this paper, we introduce distributed weight consolidation (DWC), a continual learning method to consolidate the weights of separate neural networks, each trained on an independent dataset. We evaluated DWC with a brain segmentation case study, where we consolidated dilated convolutional neural networks trained on independent structural magnetic resonance imaging (sMRI) datasets from different sites. We found that DWC led to increased performance on test sets from the different sites, while maintaining generalization performance for a very large and completely independent multi-site dataset, compared to an ensemble baseline.
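The idea of consolidating separately trained networks can be illustrated with a minimal sketch. It assumes that each site holds a fully factorized Gaussian posterior over the weights and that all sites started from the same shared Gaussian prior; neither assumption is stated in the abstract, and the function name `consolidate_gaussians` is hypothetical. Under these assumptions, multiplying the site posteriors and dividing out the prior that would otherwise be counted once per site gives a Gaussian whose precision is the sum of the site precisions minus (N - 1) times the prior precision.

```python
from typing import List, Sequence, Tuple


def consolidate_gaussians(
    prior_mean: Sequence[float],
    prior_var: Sequence[float],
    site_means: Sequence[Sequence[float]],
    site_vars: Sequence[Sequence[float]],
) -> Tuple[List[float], List[float]]:
    """Combine per-site diagonal Gaussian posteriors (illustrative assumption).

    Computes q(w) proportional to prod_i q_i(w) / p0(w)^(N-1), where p0 is the
    shared prior and q_i is the posterior from site i, one weight at a time.
    """
    n_sites = len(site_means)
    out_mean: List[float] = []
    out_var: List[float] = []
    for k in range(len(prior_mean)):
        prior_prec = 1.0 / prior_var[k]
        # Remove the prior contribution that is repeated in every site posterior.
        prec = -(n_sites - 1) * prior_prec
        weighted_mean = -(n_sites - 1) * prior_prec * prior_mean[k]
        for mean, var in zip(site_means, site_vars):
            prec += 1.0 / var[k]
            weighted_mean += mean[k] / var[k]
        if prec <= 0.0:
            raise ValueError("consolidated precision must be positive")
        out_mean.append(weighted_mean / prec)
        out_var.append(1.0 / prec)
    return out_mean, out_var


# Two sites, one weight: both posteriors are sharper than the shared prior.
mean, var = consolidate_gaussians([0.0], [1.0], [[1.0], [2.0]], [[0.5], [0.5]])
```

Because only means and variances are exchanged, no site needs to share its raw images, which is the setting the abstract motivates. The precision check guards against the case where subtracting the prior leaves no valid Gaussian.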

brain segmentation case study, name change, weight consolidation

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.60)

Distributed Weight Consolidation: A Brain Segmentation Case Study

Patrick McClure, Charles Y. Zheng, Jakub Kaczmarzyk, John Rogers-Lee, Satra Ghosh, Dylan Nielson, Peter A. Bandettini, Francisco Pereira

Neural Information Processing Systems | Nov-20-2025, 14:13:37 GMT

Reviews: Distributed Weight Consolidation: A Brain Segmentation Case Study

Neural Information Processing Systems | Oct-7-2024, 04:31:25 GMT

The paper proposes a technique for learning a model by consolidating weights across models that are trained on different datasets. The approach thus addresses an important problem that arises from limitations on sharing and pooling data. The authors apply it to brain segmentation using MeshNet architectures. The proposed method essentially starts from the model learned from one dataset, performs variational continual learning to train in parallel across multiple datasets, and then performs Bayesian parallel learning to fine-tune the model on a dataset, using as a prior the weights learned in parallel from the rest of the datasets. The approach has been tested using FreeSurfer segmentations for data from the Human Connectome Project, the Nathan Kline Institute, the Buckner Lab, and the ABIDE project.

brain segmentation case study, dataset, weight consolidation

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.37)

EVCL: Elastic Variational Continual Learning with Weight Consolidation

Batra, Hunar, Clark, Ronald

arXiv.org Machine Learning | Jun-22-2024

Continual learning aims to allow models to learn new tasks without forgetting what has been learned before. This work introduces Elastic Variational Continual Learning with Weight Consolidation (EVCL), a novel hybrid model that integrates the variational posterior approximation mechanism of Variational Continual Learning (VCL) with the regularization-based parameter-protection strategy of Elastic Weight Consolidation (EWC). By combining the strengths of both methods, EVCL effectively mitigates catastrophic forgetting and enables better capture of dependencies between model parameters and task-specific data. Evaluated on five discriminative tasks, EVCL consistently outperforms existing baselines in both domain-incremental and task-incremental learning scenarios for deep discriminative models.
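The combination of a variational term and a parameter-protection term can be sketched as a single objective. This is an illustration under stated assumptions, not the formulation from the paper: the posterior is taken to be a diagonal Gaussian, the EWC-style penalty is applied to the posterior means only, and the names `evcl_style_loss`, `fisher`, and `lam` are hypothetical.

```python
import math
from typing import Sequence


def evcl_style_loss(
    nll: float,
    q_mean: Sequence[float],
    q_var: Sequence[float],
    prev_mean: Sequence[float],
    prev_var: Sequence[float],
    fisher: Sequence[float],
    lam: float = 1.0,
) -> float:
    """Data loss + KL to the previous posterior + Fisher-weighted quadratic penalty.

    All three pieces are illustrative assumptions about how a VCL term and an
    EWC term could be combined.
    """
    # VCL-style term: KL(q || q_prev) for diagonal Gaussians.
    kl = 0.0
    for qm, qv, pm, pv in zip(q_mean, q_var, prev_mean, prev_var):
        kl += 0.5 * (math.log(pv / qv) + (qv + (qm - pm) ** 2) / pv - 1.0)

    # EWC-style term: penalize movement of important parameters.
    ewc = 0.0
    for qm, pm, f in zip(q_mean, prev_mean, fisher):
        ewc += 0.5 * lam * f * (qm - pm) ** 2

    return nll + kl + ewc


# One weight that has moved one unit away from its previous mean.
loss = evcl_style_loss(1.0, [1.0], [1.0], [0.0], [1.0], [2.0], lam=3.0)
```

In this sketch the KL term keeps the whole posterior close to the previous one, while the quadratic term adds extra resistance for weights with large Fisher values.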

elastic variational continual learning, variational continual learning, vcl

arXiv.org Machine Learning

2406.15972

Country:

Europe > Austria > Vienna (0.14)
Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.69)

SL: Stable Learning in Source-Free Domain Adaption for Medical Image Segmentation

Chen, Yixin, Wang, Yan

arXiv.org Artificial Intelligence | Jul-24-2023

Deep learning techniques for medical image analysis usually suffer from the domain shift between source and target data. Most existing works focus on unsupervised domain adaptation (UDA). However, in practical applications, privacy issues are much more severe. For example, data from different hospitals exhibit domain shifts due to equipment problems, and data from the two domains cannot be available simultaneously because of privacy. In this setting, defined as source-free UDA, previous medical UDA methods are limited. Although a variety of medical source-free unsupervised domain adaptation (MSFUDA) methods have been proposed, we found they fall into an over-fitting dilemma called "longer training, worse performance." Therefore, we propose the Stable Learning (SL) strategy to address the dilemma. SL is a scalable method that consists of Weight Consolidation and Entropy Increase and can be integrated with other approaches. First, we apply Weight Consolidation to retain domain-invariant knowledge, and then we design Entropy Increase to avoid over-learning. Comparative experiments demonstrate the effectiveness of SL. We have also conducted extensive ablation experiments and will release code covering a variety of MSFUDA methods.

artificial intelligence, machine learning, segmentation

arXiv.org Artificial Intelligence

2307.1258

Genre: Research Report > New Finding (0.46)

Industry: Health & Medicine > Diagnostic Medicine > Imaging (1.00)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Adapting machine translation models to new genres

#artificialintelligence | Nov-8-2021, 15:40:42 GMT

Neural machine translation systems are often optimized to perform well for specific text genres or domains, such as newspaper articles, user manuals, or customer support chats. In industrial settings with hundreds of language pairs to serve, however, a single translation system per language pair, which performs well across different text domains, is more efficient to deploy and maintain. Additionally, service providers may not know in advance which domains customers will be interested in. At this year's Conference on Empirical Methods in Natural Language Processing (EMNLP), we are presenting a new approach to multidomain adaptation for neural translation models, or adapting an existing model to new domains while maintaining translation quality in the original domain. Our approach provides a better trade-off between performance on old and new tasks than its predecessors do.

news article, translation model, translation system

#artificialintelligence

Genre: Research Report > New Finding (0.32)

Technology: Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)

Synaptic Metaplasticity in Binarized Neural Networks

Laborieux, Axel, Ernoult, Maxence, Hirtzlin, Tifenn, Querlioz, Damien

arXiv.org Machine Learning | Mar-7-2020

While deep neural networks have surpassed human performance in multiple situations, they are prone to catastrophic forgetting: upon training a new task, they rapidly forget previously learned ones. Neuroscience studies, based on idealized tasks, suggest that in the brain, synapses overcome this issue by adjusting their plasticity depending on their past history. However, such "metaplastic" behaviour has never been leveraged to mitigate catastrophic forgetting in deep neural networks. In this work, we highlight a connection between metaplasticity models and the training process of binarized neural networks, a low-precision version of deep neural networks. Building on this idea, we propose and demonstrate experimentally, in situations of multitask and stream learning, a training technique that prevents catastrophic forgetting without needing previously presented data, nor formal boundaries between datasets. We support our approach with a theoretical analysis on a tractable task. This work bridges computational neuroscience and deep learning, and presents significant assets for future embedded and neuromorphic systems.
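A metaplastic update for a binarized network can be sketched as follows. The sketch assumes that each binary weight is the sign of a real-valued hidden weight, and that updates pushing a hidden weight back toward zero are attenuated by the factor 1 - tanh²(m·w), so that weights with a large magnitude become harder to flip. The attenuation function, the parameter `m`, and the function names are assumptions made for illustration; the abstract does not specify them.

```python
import math


def metaplastic_step(hidden_w: float, grad: float, lr: float = 0.1, m: float = 1.0) -> float:
    """One gradient step on a hidden weight with history-dependent plasticity."""
    update = -lr * grad
    # An update with the opposite sign to the weight moves it toward zero,
    # which could eventually flip the binary weight; damp it (assumed rule).
    if update * hidden_w < 0.0:
        update *= 1.0 - math.tanh(m * hidden_w) ** 2
    return hidden_w + update


def binarize(hidden_w: float) -> float:
    """Binary weight used in the forward pass: the sign of the hidden weight."""
    return 1.0 if hidden_w >= 0.0 else -1.0


# A well-consolidated weight barely moves when pushed toward zero,
# but moves freely when pushed further from zero.
toward_zero = metaplastic_step(2.0, grad=1.0)
away_from_zero = metaplastic_step(2.0, grad=-1.0)
```

In this sketch the magnitude of the hidden weight acts as a record of past training, so no earlier data and no explicit task boundaries are needed to decide how plastic a weight should be.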

binarized neural network, dataset, neural network

arXiv.org Machine Learning

2003.03533

Country:

Asia > Japan > Honshū > Tōhoku > Fukushima Prefecture > Fukushima (0.04)
Europe > France (0.04)

Genre: Research Report (0.50)

Industry: Health & Medicine > Therapeutic Area > Neurology (0.89)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
